This proof of concept (POC) demonstrates the feasibility of offloading audio effect processing to a backend while preserving uninterrupted playback on the client. A Vite React frontend is responsible for handling user interactions and audio controls, while an Express server running ffmpeg applies the actual digital signal processing. By reducing the computational burden on the client, this architecture maintains smooth playback and provides additional capacity for more sophisticated audio transformations in the future.
Core features include real-time visualizations, dynamic crossfades between original and processed audio, and public file hosting via Supabase. Together, these elements illustrate how advanced audio rendering can be achieved without compromising frontend responsiveness.
To enhance the user experience, a custom visualizer was developed using AnalyserNode from the Web Audio API. This module captures frequency data in real time, which is then rendered onto an HTML5 canvas or SVG-based visualization. The visualizer makes it easy to observe how changes to the lowpass filter impact the frequency spectrum of the current track.
The design uses a logarithmic frequency scale, matching how human hearing interprets pitch. Users can click and drag a handle to adjust the cutoff frequency, and the visual curve updates automatically to reflect the new value. This interactive feedback loop helps illustrate exactly how the filter shapes the audio content.
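The logarithmic axis and the draggable cutoff handle can be sketched with a pair of inverse mappings. This is a minimal illustration, not the project's actual code: the function names, the 20 Hz to 20 kHz range, and the pixel width are all assumptions. The third helper relies only on the fact that AnalyserNode bins are spaced linearly between 0 Hz and the Nyquist frequency.

```javascript
// Hypothetical helpers: names and the 20 Hz to 20 kHz range are assumptions.
const MIN_FREQ = 20;
const MAX_FREQ = 20000;

// Map a frequency in Hz to an x coordinate on a logarithmic axis.
function frequencyToX(freq, width, minFreq = MIN_FREQ, maxFreq = MAX_FREQ) {
  const clamped = Math.min(Math.max(freq, minFreq), maxFreq);
  const ratio = Math.log(clamped / minFreq) / Math.log(maxFreq / minFreq);
  return ratio * width;
}

// Inverse mapping: convert a drag position back into a cutoff frequency.
function xToFrequency(x, width, minFreq = MIN_FREQ, maxFreq = MAX_FREQ) {
  const ratio = Math.min(Math.max(x / width, 0), 1);
  return minFreq * Math.pow(maxFreq / minFreq, ratio);
}

// Frequency of an AnalyserNode bin: bins are spaced linearly
// from 0 Hz up to the Nyquist frequency (sampleRate / 2).
function binToFrequency(binIndex, sampleRate, fftSize) {
  return (binIndex * sampleRate) / fftSize;
}
```

With these defaults, the midpoint of the axis corresponds to roughly 632 Hz (the geometric mean of 20 Hz and 20 kHz), so each decade of frequency receives the same horizontal space.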
The initial version relied on AudioContext and a BiquadFilterNode in JavaScript to perform a lowpass filter directly in the browser. This approach was helpful for rapid prototyping and verified that the general concept of real-time audio manipulation was viable. Implementing the filter was straightforward:
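// Create the filter once from the shared AudioContext, then update its cutoff
const filterNode = audioContext.createBiquadFilter();
filterNode.type = 'lowpass';
filterNode.frequency.value = newFrequency;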
While a JavaScript-based filter is effective for simple scenarios, it can introduce performance challenges when scaling to more advanced effects or additional processing tasks. This limitation sparked the decision to move heavier operations to the server.
To address the performance concerns inherent in client-side processing, the system was adapted to perform the lowpass effect with ffmpeg in an Express backend. When the user changes the filter frequency, the application spawns an ffmpeg process on the server that applies the new cutoff:
import { spawn } from 'child_process';

// Re-render the input file through ffmpeg's lowpass audio filter
const ffmpeg = spawn('ffmpeg', [
  '-i', inputFilePath,
  '-af', `lowpass=f=${filterFrequency}`,
  processedFilePath,
]);
By delegating computationally intensive tasks to the server, the frontend remains responsive, enabling more advanced user interactions without risking dropout or delays during playback.
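Because the cutoff value comes from the client and is interpolated into ffmpeg's filter string, it is worth validating before spawning the process. The sketch below is one possible approach rather than the project's implementation: the helper name, the 20 Hz to 20 kHz bounds, and the use of ffmpeg's -y flag (overwrite an existing output file) are assumptions.

```javascript
// Hypothetical helper; the bounds and the `-y` overwrite flag are assumptions.
function buildLowpassArgs(inputFilePath, processedFilePath, filterFrequency) {
  const frequency = Number(filterFrequency);
  // Reject anything that is not a plain number in the audible range,
  // so arbitrary text can never reach the filter expression.
  if (!Number.isFinite(frequency) || frequency < 20 || frequency > 20000) {
    throw new RangeError(`Invalid cutoff frequency: ${filterFrequency}`);
  }
  return [
    '-y',
    '-i', inputFilePath,
    '-af', `lowpass=f=${Math.round(frequency)}`,
    processedFilePath,
  ];
}
```

The returned array can be passed directly as the second argument to spawn('ffmpeg', ...). Since spawn receives an argument array rather than a shell string, the remaining risk is a malformed filter expression, which the numeric check rules out.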
An additional refinement to the user experience involves smoothly animating between old and new waveforms once the server processes audio. Since ffmpeg often removes or attenuates certain frequencies, the resulting waveform can have a lower amplitude (i.e., the visible peaks are smaller). Rather than an abrupt jump from one waveform shape to another, Framer Motion can be leveraged to transition seamlessly. This subtle animation provides clear visual feedback that an updated, potentially “thinner” or “smaller” waveform is now playing.
In the implementation, the motion.path component from Framer Motion listens for changes in the SVG path data. When the application computes a new d attribute (reflecting lower peaks after filtering out frequencies), the animation engine interpolates between the old shape and the new one over a short duration:
<motion.path
d={filledPath}
fill="black"
animate={{ d: filledPath }}
transition={{ duration: 0.5, ease: "easeInOut" }}
/>
<motion.path
d={waveformPath}
className="fill-none stroke-black"
animate={{ d: waveformPath }}
transition={{ duration: 0.5, ease: "easeInOut" }}
/>
Here, filledPath and waveformPath are updated once the audio context finishes decoding the newly processed track. If the filter has removed a meaningful part of the signal, the path outlining the wave becomes noticeably shallower, signifying reduced amplitude. Framer Motion’s interpolation morphs smoothly between the two states, so users can track how the audio’s shape (and thus its timbre) has changed.
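How the path strings are derived from the decoded audio is not shown above, so the following is only a sketch of one common approach: reduce the samples (for example, the Float32Array from AudioBuffer.getChannelData) to a fixed number of peaks and emit one path command per peak. The function name, the default of 100 points, and the top-half-only outline are assumptions.

```javascript
// Hypothetical helper: reduces samples to peaks and builds an SVG path string.
function buildWaveformPath(samples, width, height, points = 100) {
  if (samples.length === 0 || points < 2) return '';
  const blockSize = Math.max(1, Math.floor(samples.length / points));
  const midY = height / 2;
  const commands = [];
  for (let i = 0; i < points; i++) {
    // Find the largest absolute sample value within this block.
    const start = i * blockSize;
    let peak = 0;
    for (let j = start; j < start + blockSize && j < samples.length; j++) {
      peak = Math.max(peak, Math.abs(samples[j]));
    }
    // Taller peaks sit closer to the top edge (y = 0) of the drawing area.
    const x = (i / (points - 1)) * width;
    const y = midY - peak * midY;
    commands.push(`${i === 0 ? 'M' : 'L'}${x.toFixed(2)},${y.toFixed(2)}`);
  }
  return commands.join(' ');
}
```

Keeping the point count fixed means the old and new paths always contain the same number of commands, which should give the animation matching structures to interpolate between.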
An integral feature is the smooth transition between the original and processed audio tracks. Rather than abruptly switching playback, we rely on a React state variable (e.g., oldStartTime) that records when the original track started, so we can work out how long it has been playing. By calculating the playback offset, we allow the new track to resume from the correct position, which is particularly useful if the audio is looping. Once the newly processed file is fully loaded and decoded, the application gradually fades out the old source while fading in the new one. Because the filter effect is now rendered into the server-side file, we can safely disconnect the JavaScript filter node during this process.
// 1. Fetch and decode the new audio file first
const responseAudioData = await fetch(newAudioUrl).then(res => res.arrayBuffer());
const newBuffer = await audioContext.decodeAudioData(responseAudioData);
// 2. Determine how far into the loop the old track is, based on oldStartTime
const now = audioContext.currentTime;
const timeSinceStart = now - oldStartTime;
const oldBufferDuration = sourceNode?.buffer?.duration || 0;
// Guard against a missing buffer: x % 0 is NaN, which start() would reject
const playbackOffset = oldBufferDuration > 0 ? timeSinceStart % oldBufferDuration : 0;
// 3. Create the new source and gain node, then start at the matching offset (muted at first)
const newSource = audioContext.createBufferSource();
const newGainNode = audioContext.createGain();
newSource.buffer = newBuffer;
newSource.loop = true;
newSource.connect(newGainNode).connect(audioContext.destination);
newGainNode.gain.setValueAtTime(0, now);
newSource.start(now, playbackOffset);
// 4. Crossfade over one second: fade out the old track, fade in the new
// Anchor the old gain first so its ramp starts from the current value at `now`
oldGainNode.gain.setValueAtTime(oldGainNode.gain.value, now);
oldGainNode.gain.linearRampToValueAtTime(0, now + 1);
newGainNode.gain.linearRampToValueAtTime(1, now + 1);
This workflow maintains continuity in playback by matching the new track’s position to the exact point in time where the original track left off. Once the crossfade concludes, the old source is stopped, and redundant filter nodes are removed, thereby keeping the audio pipeline clean, efficient, and free of unwanted artifacts.
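The offset arithmetic and the linear ramps can be checked in isolation, without an AudioContext. The two helpers below are hypothetical (their names and the default one-second fade are assumptions) and simply restate the calculations as pure functions, assuming the old track's gain starts at 1.

```javascript
// Hypothetical helper: position within a looping buffer, in seconds.
function computePlaybackOffset(currentTime, startTime, bufferDuration) {
  // Without a positive duration the modulo would produce NaN, so fall back to 0.
  if (!(bufferDuration > 0)) return 0;
  const elapsed = Math.max(0, currentTime - startTime);
  return elapsed % bufferDuration;
}

// Hypothetical helper: gain values of a linear crossfade at a given moment.
function crossfadeGains(elapsed, fadeDuration = 1) {
  const progress = Math.min(Math.max(elapsed / fadeDuration, 0), 1);
  return { oldGain: 1 - progress, newGain: progress };
}
```

For example, a 4-second loop that started at context time 2 and is sampled at time 12.5 has been playing for 10.5 seconds, which places it 2.5 seconds into its third pass. A quarter of the way through a one-second fade, the old track is at gain 0.75 and the new one at 0.25.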
All audio files are hosted on Supabase, which provides a convenient way to store and retrieve files. The Express server downloads the requested file from the audio-files bucket and, after processing, uploads a new version with the relevant filter applied. Once the server finishes its work, a public URL is generated and returned to the frontend.
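The download, process, upload, and publish sequence might look roughly like the sketch below. It assumes the storage interface of supabase-js v2 (from, download, upload, getPublicUrl) and receives the storage client as a parameter so the flow can be exercised without network access. The function name, the applyFilter callback, and the naming scheme for processed files are all hypothetical.

```javascript
// `storage` is expected to behave like `supabase.storage` from supabase-js v2.
// `applyFilter` and the processed file naming scheme are hypothetical.
async function processAndPublish(storage, fileName, filterFrequency, applyFilter) {
  const bucket = storage.from('audio-files');

  // 1. Download the original file from the bucket.
  const { data: original, error: downloadError } = await bucket.download(fileName);
  if (downloadError) throw downloadError;

  // 2. Run the filter (for instance, by handing the data to ffmpeg).
  const processed = await applyFilter(original, filterFrequency);

  // 3. Upload the processed version, replacing any earlier render.
  const processedName = `lowpass-${filterFrequency}-${fileName}`;
  const { error: uploadError } = await bucket.upload(processedName, processed, { upsert: true });
  if (uploadError) throw uploadError;

  // 4. Return the public URL for the frontend to fetch.
  const { data } = bucket.getPublicUrl(processedName);
  return data.publicUrl;
}
```

Including the cutoff in the file name is one way to let repeated requests for the same frequency reuse an earlier render, although the sketch does not check whether one already exists.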
Meanwhile, the browser relies on a single AudioContext, which prevents glitches that often occur when reinitializing new contexts. This architecture not only keeps the frontend lightweight but also simplifies the process of introducing future enhancements, such as more advanced effects or multi-track playback.